End-to-end acoustic modelling for phone recognition of young readers

Abstract

Automatic recognition systems for child speech lag behind those dedicated to adult speech in terms of performance. This gap is due to the high acoustic and linguistic variability of child speech, caused by children's physical development, as well as to the lack of available child speech data. Young readers' speech additionally displays peculiarities, such as a slow reading rate and the presence of reading mistakes, that make the task harder. This work attempts to tackle the main challenges of phone acoustic modelling for young child speech with limited data, and to improve understanding of the strengths and weaknesses of a wide selection of model architectures in this domain. We find that transfer learning techniques are highly efficient on end-to-end architectures for adult-to-child adaptation with a small amount of child speech data. Through transfer learning, a Transformer model complemented with a Connectionist Temporal Classification (CTC) objective function reaches a phone error rate of 28.1%, outperforming a state-of-the-art DNN–HMM model by 6.6% relative, and other end-to-end architectures by more than 8.5% relative. An analysis of the models' performance on two specific reading tasks (isolated words and sentences) is provided, showing the influence of utterance length on attention-based and CTC-based models. The Transformer+CTC model displays an ability to better detect reading mistakes made by children, which can be attributed to the CTC objective function effectively constraining the attention mechanisms to be monotonic.
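The results above are stated as a phone error rate (PER) and as relative improvements. The abstract does not spell out the scoring procedure, so the following is only a minimal sketch that assumes the conventional definition: PER is the Levenshtein distance between reference and hypothesised phone sequences, summed over the corpus and divided by the total number of reference phones. The function names and the toy phone sequences are illustrative and do not come from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phone sequences (lists of strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference phone
                            curr[j - 1] + 1,      # insertion of an extra phone
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phone_error_rate(refs, hyps):
    """Corpus-level PER in percent: total edits / total reference phones."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return 100.0 * errors / total


def relative_reduction(baseline, system):
    """Relative error reduction (percent) of `system` with respect to `baseline`."""
    return 100.0 * (baseline - system) / baseline


# Toy data (hypothetical, not from the paper): one dropped phone in six.
refs = [["k", "ae", "t"], ["d", "ao", "g"]]
hyps = [["k", "ae", "t"], ["d", "ao"]]
print(phone_error_rate(refs, hyps))
print(relative_reduction(40.0, 30.0))
```

Under this definition, a figure such as "6.6% relative" is a ratio of error rates rather than a difference in percentage points, which is what `relative_reduction` computes.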

Similar resources

End-to-End Training of Acoustic Models for Large Vocabulary Continuous Speech Recognition with TensorFlow

This article discusses strategies for end-to-end training of state-of-the-art acoustic models for Large Vocabulary Continuous Speech Recognition (LVCSR), with the goal of leveraging TensorFlow components so as to make efficient use of large-scale training sets, large model sizes, and high-speed computation units such as Graphical Processing Units (GPUs). Benchmarks are presented that evaluate th...

Comparison of nerve repair with end to end, end to side with window and end to side without window methods in lower extremity of rat

Background: Although there have been various studies on end-to-side nerve repair, the results are controversial. This method is important when the proximal nerve is unavailable. With this method, the donor nerves also remain intact and uninjured. Compared to other classic procedures, end-to-side repair is not very time-consuming and needs less dissection. Overall, the previous studies i...

End-to-end esophagojejunostomy versus standard end-to-side esophagojejunostomy: which one is preferable?

Background: End-to-side esophagojejunostomy has almost always been associated with some degree of dysphagia. To overcome this complication, we decided to perform an end-to-end anastomosis and compare it with end-to-side Roux-en-Y esophagojejunostomy. Methods: In this prospective study, between 1998 and 2005, 71 patients with a diagnosis of gastric adenocarcinoma underwent total gastrec...

End-to-End Trust Starts with Recognition

Pervasive computing requires some level of trust to be established between entities. In this paper we argue for an entity recognition based approach to building this trust which differs from starting from more traditional authentication methods. We also argue for the concept of a “pluggable” recognition module which allows different recognition schemes to be used in different circumstances. Fin...

End-to-end Audiovisual Speech Recognition

Several end-to-end deep learning approaches have been recently presented which extract either audio or visual features from the input images or audio signals and perform speech recognition. However, research on end-to-end audiovisual models is very limited. In this work, we present an end-to-end audiovisual model based on residual networks and Bidirectional Gated Recurrent Units (BGRUs). To the ...

Journal

Journal title: Speech Communication

Year: 2021

ISSN: 1872-7182, 0167-6393

DOI: https://doi.org/10.1016/j.specom.2021.08.003